11 research outputs found
Unsupervised clustering method for natural language processing using asymmetric similarity measures and paradigmatic properties
Grouping and classification are among the most common tasks for human beings, yet they are highly complex. Humans, however, are weak at processing large amounts of data quickly, which is precisely what computers do well. Large amounts of data of different types and with different purposes are generated on the Internet today. Handling them requires clustering algorithms that identify the distinct groups and their characteristics automatically, without prior knowledge. It is also important to define clearly which similarity measure the clustering process will use; the great majority of the measures used for clustering focus on a symmetric aspect. This thesis proposes a novel asymmetric similarity measure, the Unilateral Jaccard Similarity Coefficient (uJaccard), under which the similarity between two objects is not the same in both directions: uJaccard(a,b) ≠ uJaccard(b,a). It also presents a weighted asymmetric similarity, the Unilateral Weighted Jaccard Similarity Coefficient, which measures the level of uncertainty between two objects. The thesis further proposes a new graph property, the paradigmatic property, which takes regular equivalence as its fundamental characteristic, and finally a clustering algorithm, PaC (Paradigmatic Clustering), which incorporates uJaccard and the paradigmatic property. Extensive evaluations were carried out on small, real, and synthetic data sets, and three large corpora were processed. The results show that PaC outperforms state-of-the-art clustering algorithms. Moreover, PaC can be executed in parallel, distributed, incremental, and streaming fashion, characteristics needed for processing large amounts of constantly generated data.
Psychometric computational thinking test
The recent widespread popularity of computational thinking (CT) has raised the need for a reliable method of assessing it. Recent CT tests focus on programming skills rather than on the analytical ability and problem-solving processes used in science, philosophy, and other areas of knowledge. This poster presents the results (test design) of an ongoing project that has developed a Psychometric Computational Thinking Test (PCTT) in three phases: test design, test implementation, and test application. Regarding the PCTT design, the reliability and validity of the test were based on content and construct validity, which also covers the rating scales used in its application. This work makes two contributions: (1) a standardized CT test design incorporating psychometric as well as computational techniques and (2) the inclusion of open-ended questions and their assessment with Aiken's V in order to validate responses. © 2018 Copyright held by the owner/author(s).
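As an illustration of the Aiken's V coefficient mentioned in this abstract, the following is a minimal Python sketch of the standard formula V = S / (n(c - 1)), where S is the sum of each judge's rating minus the lowest scale value, n is the number of judges, and c is the number of rating categories. The function name `aiken_v`, its parameters, and the sample ratings are assumptions made for this example; the abstract does not describe how the PCTT project computed or applied the coefficient.

```python
from typing import Sequence


def aiken_v(ratings: Sequence[int], lo: int, hi: int) -> float:
    """Aiken's V for one item rated by several judges.

    ratings: one integer rating per judge, each in [lo, hi].
    lo, hi:  lowest and highest values of the rating scale
             (hypothetical parameter names, chosen for this sketch).
    """
    n = len(ratings)
    if n == 0:
        raise ValueError("at least one rating is required")
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    if any(r < lo or r > hi for r in ratings):
        raise ValueError("ratings must lie within [lo, hi]")
    # S: total distance of the ratings from the lowest scale value.
    s = sum(r - lo for r in ratings)
    # For an integer scale, (hi - lo) equals c - 1, so V lies in [0, 1].
    return s / (n * (hi - lo))


# Four hypothetical judges rating one item on a 1-5 scale.
example = aiken_v([4, 5, 5, 3], lo=1, hi=5)
```

A value near 1 indicates that the judges agree the item is valid; a value near 0 indicates the opposite.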
Unilateral Weighted Jaccard Coefficient for NLP
Similarity measures are essential to solving many pattern recognition problems such as classification, clustering, and retrieval. Various similarity measures are categorized by both syntactic and semantic relationships. In this paper we present a novel similarity, the Unilateral Weighted Jaccard Coefficient (uwJaccard), which takes into consideration not only the space between two points but also the semantics between them in a distributional semantic model. The Unilateral Weighted Jaccard Coefficient provides a measure of uncertainty that can be applied between sentences such as "man bites dog" and "dog bites man". © 2015 IEEE.
Design of network infrastructure of a cloud data center for use in health sector
This article presents the design of the network infrastructure of a data center that meets the requirements arising from cloud computing, for use in the health sector of the city of Arequipa. It focuses on network layer 2 and on dimensioning that layer to meet the requirements of several health service applications. Dimensioning the network infrastructure is a complex challenge for a project that starts from scratch; in this article we present a novel approach to this challenge. Copyright © 2015 for the individual papers by the papers' authors.
Paradigmatic Clustering for NLP
How can we retrieve meaningful information from a large and sparse graph? Traditional approaches focus on generic clustering techniques and on discovering dense clusters in a network graph; however, they tend to omit interesting patterns such as paradigmatic relations. In this paper, we propose a novel graph clustering technique that models the relations of a node using paradigmatic analysis. We exploit a node's relations to extract its existing sets of signifiers. The newly found clusters represent a different view of a graph, which provides interesting insights into the structure of a sparse network graph. Our proposed algorithm, PaC (Paradigmatic Clustering), clusters graphs using paradigmatic analysis supported by an asymmetric similarity. In contrast to traditional graph clustering methods, our algorithm yields worthy results in word-sense disambiguation tasks. In addition, we propose a novel paradigmatic similarity measure. Extensive experiments and empirical analysis are used to evaluate our algorithm on synthetic and real data. © 2015 IEEE.
Complete cone symmetric temporary NAT
Network Address Translation (NAT) is a mechanism used by almost every user on the Internet, primarily to alleviate the exhaustion of the IPv4 address space by allowing multiple hosts to share a public Internet address. NAT allows TCP communication to be established if it starts from inside the NAT, but not if it starts from the public Internet, outside the NAT. This is called the NAT traversal problem. As a result, communications among peers rely on a third, intermediary computer for the whole communication. This is a security issue, because the intermediary can obtain a copy of the communication, and it also makes the communication slower, because all traffic has to pass through the third computer. This is the case for P2P, VoIP, and live games, among other Internet applications. In this article we present a novel mechanism for establishing communication among peers that are behind a NAT without using an intermediary for the whole communication. © 2016 IEEE.
Clustering algorithm based on asymmetric similarity and paradigmatic features
Similarity measures are essential to solving many pattern recognition problems such as classification, clustering, and information retrieval. Various similarity measures are categorised by both syntactic and semantic relationships. In this paper, we present a novel similarity, the unilateral Jaccard similarity coefficient (uJaccard), which takes into consideration not only the space between two points but also the semantics between them. How can we retrieve meaningful information from a large and sparse graph? Traditional approaches focus on generic clustering techniques for network graphs. However, they tend to omit interesting patterns such as paradigmatic relations. In this paper, we propose a novel graph clustering technique that models the relations of a node using paradigmatic analysis. Our proposed algorithm for graph clustering, paradigmatic clustering (PaC), uses paradigmatic analysis supported by an asymmetric similarity based on uJaccard. Extensive experiments and empirical analysis are used to evaluate our algorithm on synthetic and real data. Copyright © 2016 Inderscience Enterprises Ltd.
Comparing topics in CS syllabus with topics in CS research
This study quantifies and compares the computer security themes found in the ACM Computer Science curricula with the themes addressed in top-ranked computer security research conferences over the past six years. On the understanding that current research should help set the agenda for course coverage, we use a strategic diagram to compare the research topics with the curriculum topics and identify specific future directions for the ACM CS curriculum and for computer security courses.
Unilateral Jaccard similarity coefficient
Similarity measures are essential to solving many pattern recognition problems such as classification, clustering, and retrieval. Various similarity measures are categorized by both syntactic and semantic relationships. In this paper we present a novel similarity, the Unilateral Jaccard Similarity Coefficient (uJaccard), which takes into consideration not only the space between two points but also the semantics between them. Copyright © 2015 for the individual papers by the papers' authors.
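To make the idea of an asymmetric Jaccard-style similarity concrete, here is a minimal Python sketch. The abstract does not give the uJaccard formula, so the definition used here (shared elements divided by the size of the first set only) is an assumption chosen purely to show how a similarity can differ by direction; the function names `u_jaccard_sketch` and `jaccard` and the sample sets are likewise hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Classic symmetric Jaccard coefficient: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def u_jaccard_sketch(a: set, b: set) -> float:
    """Illustrative one-sided variant (an assumption, not the paper's
    definition): the overlap is normalized by the size of `a` alone,
    so swapping the arguments can change the result."""
    return len(a & b) / len(a) if a else 0.0


# Hypothetical neighbor sets of two nodes in a graph.
x = {"n1", "n2", "n3"}
y = {"n3", "n4"}

sym = jaccard(x, y)            # same value as jaccard(y, x)
x_to_y = u_jaccard_sketch(x, y)  # overlap relative to x
y_to_x = u_jaccard_sketch(y, x)  # overlap relative to y
```

With these sets the two one-sided values differ because `x` and `y` have different sizes, while the classic coefficient is identical in both directions.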
AL-DDoS attack detection optimized with genetic algorithms
Application Layer DDoS (AL-DDoS) is a major danger for Internet information services, because these attacks are easy for attackers to implement and perform and are difficult to detect and stop using traditional firewalls. They manage to saturate, physically and computationally, the information services offered on the network, directly harming legitimate users. To deal with this type of attack at the network layer, previous approaches propose a configurable statistical model and observe that optimizing its configuration parameters with genetic algorithms improves its effectiveness in detecting Network Layer DDoS (NL-DDoS). However, this method is not enough to stop DDoS at the application level, because that level presents different characteristics. We therefore propose a new method, configurable and optimized for different attack scenarios, that effectively detects AL-DDoS. © Springer Nature Switzerland AG 2018.